5 research outputs found

Learning nonlocal phonotactics in Strictly Piecewise phonotactic model

Author: Dai Huteng
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2021

Phonotactic learning is a crucial aspect of phonological acquisition and has figured significantly in computational research in phonology (Prince & Tesar 2004). However, one persistent challenge for this line of research is inducing non-local co-occurrence patterns (Hayes & Wilson 2008). The current study develops a probabilistic phonotactic model based on the Strictly Piecewise class of subregular languages (Heinz 2010). The model successfully learns both segmental and featural representations, and correctly predicts the acceptabilities of the nonce forms in Quechua (Gouskova & Gallagher 2020; G & G henceforth).

Learning Nonlocal Phonotactics in a Strictly Piecewise Probabilistic Phonotactic Model

Author: Dai Huteng
Publication venue: Linguistic Society of America
Publication date: 01/05/2021

Proceedings Published by the LSA (Linguistic Society of America)

Information-theoretic Characterization of the Sub-regular Hierarchy

Authors: Dai Huteng, Futrell Richard
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2020

Our goal is to link two different formal notions of complexity: the complexity classes defined by Formal Language Theory (FLT), in particular the Sub-regular Hierarchy (Rogers et al., 2013; Lai, 2015; Heinz, 2018), and Statistical Complexity Theory (Feldman and Crutchfield, 1998; Crutchfield and Marzen, 2015). The motivation for exploring this connection is that factors involving memory resources have been hypothesized to explain why phonological processes seem to inhabit the Sub-regular Hierarchy, and Statistical Complexity Theory gives an information-theoretic characterization of memory use. It is currently not known whether statistical complexity and FLT define equivalent complexity classes, or whether statistical complexity cross-cuts the usual FLT hierarchies. Our work begins to bridge the gap between FLT and Information Theory by presenting characterizations of certain Sub-regular languages in terms of statistical complexity.

Learning Phonotactics in a Differentiable Framework of Subregular Languages

Authors: Dai Huteng, Futrell Richard
Publication venue: Linguistic Society of America
Publication date: 05/08/2022

Phonotactic constraints have been argued to be regular, meaning that they can be represented using finite-state automata (Heinz, 2018); furthermore, they have been argued to occupy an even more restricted region of the regular language class known as the subregular hierarchy (Rogers & Pullum, 2011). Our contribution is to present a simple model of phonotactic learning from positive evidence. Our approach is based on probabilistic finite-state automata (Vidal et al., 2005a,b). We study the model’s ability to induce local and nonlocal phonotactics from wordlist data, both with and without formal constraints on the automaton. In particular, we evaluate the ability of our learner to induce nonlocal phonotactic constraints from data of Navajo and Quechua. Our work provides a framework in which different formal models of phonotactics can be compared, and sheds light on the structural nature of phonological acquisition (Dai, 2021; Shibata & Heinz, 2019; Heinz & Rogers, 2010, 2013).

Proceedings Published by the LSA (Linguistic Society of America)

Rethinking Representations: A Log-bilinear Model of Phonotactics

Authors: Dai Huteng, Futrell Richard, Mayer Connor
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/06/2023

Models of phonotactics include subsegmental representations in order to generalize to unattested sequences. These representations can be encoded in at least two ways: as discrete, phonetically-based features, or as continuous, distribution-based representations induced from the statistical patterning of sounds. Because phonological theory typically assumes that representations are discrete, past work has reduced continuous representations to discrete ones, which eliminates potentially relevant information. In this paper we present a model of phonotactics that can use continuous representations directly, and show that this approach yields competitive performance on modeling experimental judgments of English sonority sequencing. The proposed model broadens the space of possible phonotactic models by removing requirements for discrete features, and is a step towards an integrated picture of phonotactic learning based on distributional statistics and continuous representations.
